How Does Spark Calculate Retrieval Success Rate?
| Created by | Patrick Woodhead |
|---|---|
| Created time | |
| Last edited by | Patrick Woodhead |
| Last edited time | |
| Tags | |
TL;DR
FIL Spark is a proof-of-retrievability protocol for verifying the retrievability of data stored with Filecoin Storage Providers. Very simply, Spark works by randomly sampling CIDs stored on Filecoin and then attempting to retrieve them. The results, i.e. whether each file is retrievable or not, are recorded and can be aggregated to calculate the Spark retrieval success rate (RSR).
For Filecoin, at a network-wide view, the Spark RSR simply shows the percentage of retrieval attempts that succeeded. The data can also be aggregated by Storage Provider, by allocator, or by client to show the RSR over files linked to each of these entities.
We will now go through the Spark protocol in more depth to show exactly how the Spark RSR values are created. We will provide links to more in-depth descriptions and discussions around each step of the protocol.
Spark Protocol
Deal Ingestion
The first step in the Spark protocol is to build a list of all files that should be available for “fast” retrieval. When we say fast, we mean that this file is stored unsealed by the Storage Provider so that it can be retrieved without needing to unseal the data.
Currently (October 2024), each week the Spark team runs a manual deal ingestion process (Github) that scans through all recently-made storage deals in the f05 storage market actor and translates these deals into retrieval tasks stored in an off-chain Spark database. A retrieval task is the tuple (CID, Storage Provider), where the CID refers to a payload CID, as opposed to a piece CID or a deal CID. A payload CID is the root CID of some data, such as a file.
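As a minimal illustration, a retrieval task can be modeled as a simple pair. The field names and example values below are ours, not the actual Spark schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetrievalTask:
    # Payload CID: the root CID of the stored content
    # (not a piece CID or a deal CID)
    payload_cid: str
    # The Storage Provider (miner ID) expected to serve the retrieval
    miner_id: str

# Example task as produced by deal ingestion (values are made up)
task = RetrievalTask(payload_cid="bafy...example", miner_id="f0123456")
```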
In the future, when Spark is compatible with DDO, there will be real-time deal ingestion into the Spark database when storage deals are made (Github). This will mean that new SPs will not need to wait a week to get a Spark score.
Committee Building
Each round of the Spark protocol is approximately 20 minutes. In each round, all of the online Spark checkers (i.e. currently people who are running Filecoin Station) are grouped deterministically at random into committees. The committees are designed to include between 40 and 100 checkers, enough to protect against malicious actors, but not so many that we load-test the Storage Providers.
The members of each committee will go on to make the same retrieval checks in the round as we will see in the subsequent steps of the protocol. The committee will then come to an honest majority consensus about the result of the retrieval.
The idea behind having committees is that we can’t trust the retrieval result of one Spark checker but we can trust a committee to have an honest majority.
If we trusted the results of individual checkers, an adversary who wants only a certain SP to prosper could report positive results for that SP and negative results for all other SPs. Committees are a protection against this sort of behaviour.
Committees also mean more retrieval tasks can be completed in each 20-minute round than if all checkers performed the same tasks. This prevents any one Storage Provider from receiving too many identical requests at the same time.
Committees are built deterministically at random for each round using Drand. This protects against one party controlling an entire committee and dominating the honest majority decision.
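A minimal sketch of how a shared drand beacon can seed deterministic committee assignment. This is illustrative only, not Spark's exact algorithm; the function name and inputs are assumptions:

```python
import hashlib

def assign_committee(drand_randomness: str, checker_id: str,
                     num_committees: int) -> int:
    """Deterministically map a checker to a committee for one round.

    Every honest party computes the same assignment from the shared
    drand beacon, so no single party can choose or stack a committee.
    """
    digest = hashlib.sha256(
        f"{drand_randomness}:{checker_id}".encode()
    ).digest()
    return int.from_bytes(digest[:8], "big") % num_committees

# Any two parties derive the same committee for a given checker and round
c = assign_committee("round-42-beacon", "checker-a", 10)
```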
Task Sampling
Task sampling refers to the step where committees are assigned the retrieval tasks that they should attempt in each round. Each committee is assigned a set of tasks per round. The exact number of tasks fluctuates so that the total number of requests made by Spark per round stays stable and we don’t load-test the Providers.
A retrieval task is the tuple (CID, Storage Provider).
These tasks are chosen at random. This is to prevent Spark checkers from choosing their own tasks to benefit themselves, such as an SP wanting to run lots of tasks against itself. We don’t yet use Drand for randomness here, but we would like to introduce it to improve the end-to-end verifiability.
Only tasks that have been assigned in each round are used in the Spark RSR calculation; all others are ignored. This also means Spark checkers are only rewarded for the specified tasks each round.
For more information, see https://blog.filstation.app/posts/how-spark-samples-filecoin-deals.
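One way to sketch load-stable sampling: fix a total request budget per round and derive the per-committee task count from it. The parameters and the use of Python's `random` module are illustrative assumptions (as noted above, Spark does not yet derive this randomness from drand):

```python
import random

def sample_tasks(all_tasks, num_committees, total_requests_per_round,
                 committee_size, seed):
    """Assign each committee a random subset of tasks.

    Every member of a committee attempts every task assigned to it, so
    holding (tasks per committee) * committees * committee size roughly
    equal to the budget keeps the total request volume per round stable.
    """
    tasks_per_committee = total_requests_per_round // (num_committees * committee_size)
    rng = random.Random(seed)  # ideally seeded from a drand beacon
    return [rng.sample(all_tasks, tasks_per_committee)
            for _ in range(num_committees)]

committees = sample_tasks(list(range(100)), num_committees=5,
                          total_requests_per_round=100,
                          committee_size=2, seed=1)
```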
Retrieval Requests
Once the Spark checkers have received the list of retrieval tasks they must perform in the round, they begin to run their checks.
There is an in-depth discussion on how this part of the protocol works here: https://blog.filstation.app/posts/how-spark-retrieves-content-stored-on-filecoin
Here is a summary taken from the blog post:
SPARK’s retrieval test of (CID, minerID) performs the following steps:
- Call the Filecoin RPC API method `Filecoin.StateMinerInfo` to map `minerID` to `PeerID`.
- Call `https://cid.contact/cid/{CID}` to obtain all retrieval providers.
- Filter the response to find the provider identified by the `PeerID` found in step 1 and obtain the address where this provider serves retrievals.
- Retrieve the root block of the content identified by `CID` from that address using the Graphsync or IPFS Trustless Gateway protocol. The protocol advertised to IPNI is used for the retrieval; if both protocols are advertised, HTTP is chosen.
- Verify that the received block matches the `CID`.
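The final verification step can be sketched in Python. Note that real CIDs encode a multihash (typically sha2-256) plus version and codec prefixes; this sketch compares a bare sha-256 digest instead, so the function and schema are illustrative, not Spark's actual implementation:

```python
import hashlib

def verify_block(block: bytes, expected_digest_hex: str) -> bool:
    """Last step of the check: confirm the received root block
    actually hashes to the expected digest, so a provider cannot
    pass the check by serving arbitrary bytes."""
    return hashlib.sha256(block).hexdigest() == expected_digest_hex

block = b"example payload bytes"
digest = hashlib.sha256(block).hexdigest()
```

A mismatching block (e.g. tampered or wrong content) fails the check, and the checker records the retrieval as unsuccessful.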
Reporting Measurements to Spark-Publish
When the Spark checkers have completed a retrieval and determined whether or not the CID is retrievable by following the above steps, they report their result, which we call a measurement, to the Spark-Publish service. Spark-Publish ingests these measurements, batches them into chunks of up to 100k measurements, uploads each chunk to web3.storage (now Storacha), and commits the batch CID to the smart contract.
Evaluating Measurements with Spark-Evaluate
Spark-Evaluate is the Spark service that evaluates each measurement to decide whether or not it is valid. It listens for on-chain events indicating that Spark-Publish has posted a commitment on chain. It then takes the CID of the commitment and fetches the measurements from Storacha.
Once it has the measurements, it performs fraud detection to remove all unwanted measurements. It removes measurements for tasks that were not assigned in the round, measurements not submitted by a member of the right committee, and duplicate measurements from the same IPv4 /24 subnet. It then performs honest-majority consensus: it calculates the honest-majority result for each task in each committee.
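The filtering and consensus logic can be sketched as follows. The measurement schema and function name are illustrative assumptions, not Spark-Evaluate's actual code, and ties count as a failed retrieval in this sketch:

```python
def evaluate_task(measurements, committee, assigned_tasks):
    """Filter raw measurements, then take the majority verdict.

    Each measurement is (checker_id, subnet, task, success).
    Returns (valid_results, majority_verdict).
    """
    seen_subnets = set()
    valid = []
    for checker_id, subnet, task, success in measurements:
        if task not in assigned_tasks:   # task not assigned this round
            continue
        if checker_id not in committee:  # submitter not in this committee
            continue
        if subnet in seen_subnets:       # duplicate from same /24 subnet
            continue
        seen_subnets.add(subnet)
        valid.append(success)
    # Honest-majority consensus: strictly more than half must succeed
    majority = (sum(valid) * 2 > len(valid)) if valid else None
    return valid, majority
```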
On-chain Evaluation and Rewards
There are two final steps in the Spark protocol that pertain to how Spark checkers (Station Operators) get rewarded for their endeavours. However, they are not important for this page, where we want to know how the Spark RSR is calculated. You can find more details about these steps at https://docs.filspark.com
Spark RSR Calculations
High level Spark RSR Figure
At this point of the protocol, we have a set of valid measurements for each round stored off chain and committed to on chain for verifiability. Over many rounds, we have a continuous stream of measurements being stored off chain and committed to on chain. From this we can calculate the Spark RSR values.
Given a specific timeframe, the top-level Spark RSR figure is calculated by dividing the number of successful retrievals made in that timeframe by the total number of retrievals made in that timeframe.
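As a minimal sketch, the top-level calculation is just a ratio (the function name is ours):

```python
def spark_rsr(successful: int, total: int) -> float:
    """Top-level Spark RSR: successful retrievals divided by all
    retrievals in the timeframe. Returns 0.0 for an empty window."""
    return successful / total if total else 0.0
```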
However, as you may have noticed, we have not used the honest-majority consensus results in this calculation; we are counting over all valid (i.e. non-fraudulent) requests. This is because there is a subtle problem with using the committee consensus results.
To give an example, let’s say that a Storage Provider only serves 60% of retrieval requests. Assuming that all Spark checkers are acting honestly, when the Spark checkers make their checks, 60% of each committee reports that the CID is retrievable from the SP, while 40% report that it is unretrievable. With honest-majority consensus, this file is deemed to be retrievable. If we use the results of the honest-majority consensus rather than the raw measurements, we lose some fidelity in the retrievability of data from this SP. Specifically, instead of reporting a Spark RSR of 60%, we report a Spark RSR of 100%, which is misleading.
We believe that the 60% value is more accurate, yet we also need committees to prevent fraudulent behaviour. Currently, we store both the committee-based score and the raw measurement score, and we plan to use the committee results as a reputation score by which to weight the measurements from checkers in a committee.
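A toy version of the fidelity loss described above, for a committee of ten honest checkers testing an SP that serves 60% of requests:

```python
# Ten honest checkers: six successful retrievals, four failures
results = [True] * 6 + [False] * 4

# Raw-measurement RSR reflects the SP's actual service level
raw_rsr = sum(results) / len(results)

# Committee consensus collapses the same data to all-or-nothing
committee_rsr = 1.0 if sum(results) * 2 > len(results) else 0.0
```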
Spark RSR by SP
To calculate a Spark RSR for an SP in a given timeframe:
- We take all retrieval tasks linked to the given minerID in the timeframe.
- We count the total number of requests made that passed the fraud detection checks.
- We count how many of these requests were successful.
- We divide the number of successful requests by the total number of requests.
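The steps above can be sketched as follows; the measurement schema (a flat list of `(miner_id, success)` pairs that already passed fraud detection) is an illustrative assumption:

```python
def rsr_by_sp(measurements, miner_id):
    """Per-SP Spark RSR over measurements that passed fraud detection.

    measurements: iterable of (miner_id, success) pairs.
    Returns None if the SP had no checked retrievals in the timeframe.
    """
    relevant = [ok for m, ok in measurements if m == miner_id]
    return sum(relevant) / len(relevant) if relevant else None

data = [("f01", True), ("f01", False), ("f02", True), ("f01", True)]
```

The same grouping pattern extends naturally to the per-client and per-allocator calculations once tasks are linked to those entities.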
Spark RSR by Client
TODO
Spark RSR by Allocator
TODO